The problem: brute-force searching the entire hyperparameter space for optimal settings takes a long time.

There are 2 situations:

1. **Number of trials > number of hyperparameters -> ML**
    - Training time is not exorbitantly long, so we can afford a large number of trials.
    - Build a quantum-inspired algorithm to maneuver the hyperparameter space and speed up the optimisation.
    - Goal: take fewer trials exploring combinations of hyperparameters to find the optimal setting.
2. **Number of trials < number of hyperparameters -> DL**
    - Training time is much longer (days or weeks), so we cannot afford many full-length trials.
    - Boost our capacity by starting many trials, where each trial is the evaluation of one set of hyperparameters. Instead of training every trial for the full number of epochs, build an algorithm that weeds out the least promising candidates early during training and reallocates computing resources to the more promising ones.

### Hyper Parameter Tuning Algorithms

![[Pasted image 20210917131326.png]]

TPE (Bayesian Optimization) > Random Search > Grid Search

[[Grid Search and Random Search]]
[[Bayesian Optimisation]]
[[Asynchronous Successive Halving Algorithm (ASHA)]]
[[Population Based Training]]
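
The early-stopping idea for long-running trials (drop the least promising candidates early and give the freed budget to the rest) can be sketched as plain synchronous successive halving. This is a simplified illustration, not ASHA itself, which promotes trials asynchronously. The names `successive_halving`, `toy_loss`, `eta`, `min_epochs` and `max_epochs` are all assumptions made for this sketch, and `toy_loss` is a hypothetical stand-in for actually training a model.

```python
import math


def toy_loss(cfg, epochs):
    """Hypothetical stand-in for 'train for `epochs` epochs, return validation loss'.

    The loss is lowest at lr = 1e-2 and improves as the epoch budget grows.
    """
    return (math.log10(cfg["lr"]) + 2.0) ** 2 + 1.0 / epochs


def successive_halving(configs, evaluate, min_epochs=1, max_epochs=27, eta=3):
    """Synchronous successive halving (a simplified sketch, not ASHA).

    Every surviving config is evaluated at the current epoch budget, only the
    best 1/eta are kept, and the budget is multiplied by eta for the next round.
    """
    survivors = list(configs)
    budget = min_epochs
    while True:
        # Score every survivor at the current budget; lower loss is better.
        scored = sorted(
            ((evaluate(cfg, budget), cfg) for cfg in survivors),
            key=lambda pair: pair[0],
        )
        # Stop when one candidate remains or the full budget has been spent.
        if len(scored) == 1 or budget >= max_epochs:
            best_loss, best_cfg = scored[0]
            return best_cfg, best_loss
        keep = max(1, len(scored) // eta)
        survivors = [cfg for _, cfg in scored[:keep]]
        budget = min(budget * eta, max_epochs)


# Nine candidate learning rates, log-spaced from 1e-4 to 1.
configs = [{"lr": 10 ** (-4 + 0.5 * i)} for i in range(9)]
best_cfg, best_loss = successive_halving(configs, toy_loss)
print(best_cfg, best_loss)
```

With nine candidates and `eta=3`, only three configurations survive past the first epoch and one past the third, so most of the epoch budget goes to the candidates that looked best early. In a real setup `evaluate` would resume training from a checkpoint rather than restart at each round.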